feat(isolation): add configurable worktree creation timeout#1029
feat(isolation): add configurable worktree creation timeout#1029norbinsh wants to merge 2 commits intocoleam00:devfrom
Conversation
Add `worktree.timeout` option (ms) to `.archon/config.yaml` that flows through the isolation layer to worktree creation execFileAsync calls, replacing the hardcoded 30000ms timeout. Default remains 30000ms for backward compatibility. Fixes repos with heavy post-checkout hooks that exceed 30s. - Add `timeout?: number` to RepoConfig.worktree and WorktreeCreateConfig - Thread timeout through WorktreeProvider creation methods (11 instances) - Keep hardcoded timeouts for removal, listing, and branch cleanup ops Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis change makes the git worktree operation timeout configurable. A new optional Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/isolation/src/providers/worktree.ts`:
- Around line 559-560: Validate the user-provided worktree timeout before using
it: check worktreeConfig?.timeout (the value assigned to the local variable
timeout) is a finite positive integer within an acceptable bound (e.g., >0 and
<= some safe max like 5 minutes), throw a clear error if it is missing/invalid,
and use the sanitized value (e.g., validatedTimeout) when building process
options; update the validation in the worktree creation flow in this file
(around the const timeout = worktreeConfig?.timeout ?? 30000 line) so invalid
YAML-cast values are rejected early.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: dad1e612-6b56-475c-be9a-0a87e17e431c
📒 Files selected for processing (3)
packages/core/src/config/config-types.tspackages/isolation/src/providers/worktree.tspackages/isolation/src/types.ts
Guard against invalid YAML-cast values (e.g., strings, negatives, NaN) by checking the type before using the config timeout. Falls back to 30000ms default for any nonsensical value. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
packages/isolation/src/providers/worktree.ts (1)
559-562:⚠️ Potential issue | 🟡 MinorClamp and finite-check
worktree.timeoutbefore using it in process options.This check still accepts
Infinityand unbounded large numbers. Please sanitize to a finite positive integer and cap to a safe upper bound before passing it to git calls.🔧 Proposed hardening
- const timeout = - typeof worktreeConfig?.timeout === 'number' && worktreeConfig.timeout > 0 - ? worktreeConfig.timeout - : 30000; + const configuredTimeout = worktreeConfig?.timeout; + const timeout = + typeof configuredTimeout === 'number' && + Number.isFinite(configuredTimeout) && + configuredTimeout > 0 && + configuredTimeout <= 2_147_483_647 + ? Math.floor(configuredTimeout) + : 30000;In Node.js child_process.execFile options, what are the valid bounds/behavior for `timeout`, specifically for `Infinity` and values greater than 2^31-1 milliseconds?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/isolation/src/providers/worktree.ts` around lines 559 - 562, The computed timeout value from worktreeConfig?.timeout may be Infinity or exceed platform limits; change the logic around the timeout variable so you first verify Number.isFinite(worktreeConfig.timeout) and that it is > 0, coerce to an integer (e.g., Math.floor), then clamp it to a safe upper bound such as MAX_TIMEOUT = 2**31 - 1 (2147483647) before using timeout in process options for git calls; update references to timeout (the const defined from worktreeConfig?.timeout) to use this sanitized value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@packages/isolation/src/providers/worktree.ts`:
- Around line 559-562: The computed timeout value from worktreeConfig?.timeout
may be Infinity or exceed platform limits; change the logic around the timeout
variable so you first verify Number.isFinite(worktreeConfig.timeout) and that it
is > 0, coerce to an integer (e.g., Math.floor), then clamp it to a safe upper
bound such as MAX_TIMEOUT = 2**31 - 1 (2147483647) before using timeout in
process options for git calls; update references to timeout (the const defined
from worktreeConfig?.timeout) to use this sanitized value.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a203ea52-7480-4aad-850a-077746e814c6
📒 Files selected for processing (1)
packages/isolation/src/providers/worktree.ts
…nd subprocesses (coleam00#1030) * Fix: prevent target repo .env leakage into Claude subprocess (coleam00#1029) Bun auto-loads CWD .env before user code runs. When archon runs from a target repo, that repo's ANTHROPIC_API_KEY (and other secrets) leaked into process.env and were passed through to the Claude Code subprocess, billing the wrong account. Changes: - Add SUBPROCESS_ENV_ALLOWLIST + buildCleanSubprocessEnv() utility - claude.ts buildSubprocessEnv now starts from allowlist, not process.env - CLI and server entry points strip all keys parsed from CWD .env on startup (replaces DATABASE_URL-only patch) - Update claude.test.ts to assert ANTHROPIC_API_KEY no longer reaches subprocess - Add env-allowlist tests Fixes coleam00#1029 * fix: address review findings for env allowlist PR - Register env-allowlist.test.ts in @archon/core test batch so security tests run in CI - Remove CLAUDECODE, NODE_OPTIONS, VSCODE_INSPECTOR_OPTIONS from allowlist (they are always stripped by caller — listing them was semantically contradictory) - Add intent comment to silent dotenv parse fallback in cli.ts and server/index.ts - Add test for useGlobalAuth=false path verifying ANTHROPIC_API_KEY excluded from subprocess env - Update security.md to document full CWD .env isolation and subprocess allowlist (was only describing DATABASE_URL)
…nd subprocesses (coleam00#1030) * Fix: prevent target repo .env leakage into Claude subprocess (coleam00#1029) Bun auto-loads CWD .env before user code runs. When archon runs from a target repo, that repo's ANTHROPIC_API_KEY (and other secrets) leaked into process.env and were passed through to the Claude Code subprocess, billing the wrong account. Changes: - Add SUBPROCESS_ENV_ALLOWLIST + buildCleanSubprocessEnv() utility - claude.ts buildSubprocessEnv now starts from allowlist, not process.env - CLI and server entry points strip all keys parsed from CWD .env on startup (replaces DATABASE_URL-only patch) - Update claude.test.ts to assert ANTHROPIC_API_KEY no longer reaches subprocess - Add env-allowlist tests Fixes coleam00#1029 * fix: address review findings for env allowlist PR - Register env-allowlist.test.ts in @archon/core test batch so security tests run in CI - Remove CLAUDECODE, NODE_OPTIONS, VSCODE_INSPECTOR_OPTIONS from allowlist (they are always stripped by caller — listing them was semantically contradictory) - Add intent comment to silent dotenv parse fallback in cli.ts and server/index.ts - Add test for useGlobalAuth=false path verifying ANTHROPIC_API_KEY excluded from subprocess env - Update security.md to document full CWD .env isolation and subprocess allowlist (was only describing DATABASE_URL)
|
This PR appears to fully address #1119. Consider adding |
|
Thanks for the careful scoping and the detailed writeup, @norbinsh — the underlying problem (30s is too tight when repos have heavy post-checkout hooks) is real, and your PR body made it easy to evaluate. After thinking about it though, I don't think a new config key is the right primitive here. A few reasons:
I'm going to close this in favor of #XXXX, which consolidates all 15 sites onto a single Really appreciate you surfacing the problem and writing up #1119. Crediting you in the replacement PR. |
All 15 worktree git-subprocess timeouts in WorktreeProvider were hardcoded at 30000ms. Repos with heavy post-checkout hooks (lint, dependency install, submodule init) routinely exceed that budget and fail worktree creation. Consolidate them onto a single GIT_OPERATION_TIMEOUT_MS constant at 5 min. Generous enough to cover reported cases while still catching genuine hangs (credential prompts in non-TTY, stalled fetches). Chosen over the config-key approach in #1029 to avoid adding permanent .archon/config.yaml surface for a problem a raised default solves cleanly. If 5 min turns out to also be too tight for real-world use, we'll revisit. Closes #1119 Supersedes #1029 Co-authored-by: Shay Elmualem <12733941+norbinsh@users.noreply.github.com>
All 15 worktree git-subprocess timeouts in WorktreeProvider were hardcoded at 30000ms. Repos with heavy post-checkout hooks (lint, dependency install, submodule init) routinely exceed that budget and fail worktree creation. Consolidate them onto a single GIT_OPERATION_TIMEOUT_MS constant at 5 min. Generous enough to cover reported cases while still catching genuine hangs (credential prompts in non-TTY, stalled fetches). Chosen over the config-key approach in #1029 to avoid adding permanent .archon/config.yaml surface for a problem a raised default solves cleanly. If 5 min turns out to also be too tight for real-world use, we'll revisit. Closes #1119 Supersedes #1029 Co-authored-by: Shay Elmualem <12733941+norbinsh@users.noreply.github.com>
…nd subprocesses (coleam00#1030) * Fix: prevent target repo .env leakage into Claude subprocess (coleam00#1029) Bun auto-loads CWD .env before user code runs. When archon runs from a target repo, that repo's ANTHROPIC_API_KEY (and other secrets) leaked into process.env and were passed through to the Claude Code subprocess, billing the wrong account. Changes: - Add SUBPROCESS_ENV_ALLOWLIST + buildCleanSubprocessEnv() utility - claude.ts buildSubprocessEnv now starts from allowlist, not process.env - CLI and server entry points strip all keys parsed from CWD .env on startup (replaces DATABASE_URL-only patch) - Update claude.test.ts to assert ANTHROPIC_API_KEY no longer reaches subprocess - Add env-allowlist tests Fixes coleam00#1029 * fix: address review findings for env allowlist PR - Register env-allowlist.test.ts in @archon/core test batch so security tests run in CI - Remove CLAUDECODE, NODE_OPTIONS, VSCODE_INSPECTOR_OPTIONS from allowlist (they are always stripped by caller — listing them was semantically contradictory) - Add intent comment to silent dotenv parse fallback in cli.ts and server/index.ts - Add test for useGlobalAuth=false path verifying ANTHROPIC_API_KEY excluded from subprocess env - Update security.md to document full CWD .env isolation and subprocess allowlist (was only describing DATABASE_URL)
Summary
worktree.timeoutconfig option (ms) that flows through the isolation layer to all worktree creationexecFileAsynccalls, with 30000ms default for backward compatibility.@archon/gitpackage,MergedConfig, orGlobalConfig.UX Journey
Before
After
Architecture Diagram
Before
After
Connection inventory:
RepoConfig.worktreeWorktreeCreateConfigtimeout?: numberWorktreeCreateConfigWorktreeProvider.createWorktreeProvider→execFileAsync(creation)WorktreeProvider→execFileAsync(removal/list)Label Snapshot
risk: lowsize: Sisolation,configisolation:worktree-provider,core:config-typesChange Metadata
featureisolationLinked Issue
N/A — Feature request from workflow usage observation.
Validation Evidence (required)
bun run validate # ✅ All pass@archon/webfailure (missingmdast-util-gfmdep, identical ondev)Security Impact (required)
NoNoNoNoCompatibility / Migration
Yes— default remains 30000ms when not configuredYes— new optionalworktree.timeoutin.archon/config.yamlNoHuman Verification (required)
bun run validatepasses, diff reviewed for all 15 timeout instances (11 updated, 4 kept)Side Effects / Blast Radius (required)
@archon/isolationRollback Plan (required)
Risks and Mitigations
Summary by CodeRabbit